PKUSUMSUM : A Java Platform for Multilingual Document Summarization
نویسندگان
چکیده
PKUSUMSUM is a Java platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks. The summarization platform has been released and users can easily use and update it. In this paper, we make a brief description of the characteristics, the summarization methods, and the evaluation results of the platform, and also compare PKUSUMSUM with other summarization toolkits.
منابع مشابه
A Platform for Multilingual News Summarization
We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a variety of methods to translate the doc...
متن کاملMulti-document multilingual summarization corpus preparation, Part 1: Arabic, English, Greek, Chinese, Romanian
This document overviews the strategy, effort and aftermath of the MultiLing 2013 multilingual summarization data collection. We describe how the Data Contributors of MultiLing collected and generated a multilingual multi-document summarization corpus on 10 different languages: Arabic, Chinese, Czech, English, French, Greek, Hebrew, Hindi, Romanian and Spanish. We discuss the rationale behind th...
متن کاملMultiLing 2013 MultiLing 2013: Multilingual Multi-document Summarization
This document overviews the strategy, effort and aftermath of the MultiLing 2013 multilingual summarization data collection. We describe how the Data Contributors of MultiLing collected and generated a multilingual multi-document summarization corpus on 10 different languages: Arabic, Chinese, Czech, English, French, Greek, Hebrew, Hindi, Romanian and Spanish. We discuss the rationale behind th...
متن کاملMulti-document multilingual summarization corpus preparation, Part 2: Czech, Hebrew and Spanish
This document overviews the strategy, effort and aftermath of the MultiLing 2013 multilingual summarization data collection. We describe how the Data Contributors of MultiLing collected and generated a multilingual multi-document summarization corpus on 10 different languages: Arabic, Chinese, Czech, English, French, Greek, Hebrew, Hindi, Romanian and Spanish. We discuss the rationale behind th...
متن کاملExB Text Summarizer
We present our state of the art multilingual text summarizer capable of single as well as multi-document text summarization. The algorithm is based on repeated application of TextRank on a sentence similarity graph, a bag of words model for sentence similarity and a number of linguistic preand post-processing steps using standard NLP tools. We submitted this algorithm for two different tasks of...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016